Solar forecasting from ground-based sky images using deep learning models has shown great promise in reducing the uncertainty in solar power generation. One of the biggest challenges for training deep learning models is the availability of labeled datasets. With more and more sky image datasets open sourced in recent years, the development of accurate and reliable solar forecasting methods has seen a huge growth in potential. In this study, we explore three different training strategies for deep-learning-based solar forecasting models by leveraging three heterogeneous datasets collected around the world with drastically different climate patterns. Specifically, we compare the performance of models trained individually on local datasets (local models) and models trained jointly on the fusion of multiple datasets from different locations (global models), and we further examine the knowledge transfer from pre-trained solar forecasting models to a new dataset of interest (transfer learning models). The results suggest that the local models work well when deployed locally, but significant errors in the scale of the prediction are observed when they are applied offsite. The global model can adapt well to individual locations, although the possible increase in training effort needs to be taken into account. Pre-training models on a large and diversified source dataset and transferring them to a local target dataset generally achieves superior performance over the other two training strategies. Transfer learning brings the most benefit when local data are limited. With 80% less training data, it can achieve a 1% improvement over the local baseline model trained using the entire dataset. Therefore, we call on the solar forecasting community to contribute to a global dataset containing a massive amount of imagery and diverse samples covering a range of sky conditions.
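Graph self-supervised learning (GSSL) paves the way for learning graph embeddings without expert annotation, which is particularly impactful for molecular graphs because the number of possible molecules is enormous and labels are expensive to obtain. However, by design, GSSL methods are not trained to perform well on one downstream task but aim for transferability to many, which makes evaluating them less straightforward. As a step toward obtaining profiles of molecular graph embeddings with diverse and interpretable attributes, we introduce Molecular Graph Representation Evaluation (MolGraphEval), a suite of probe tasks grouped into (i) topological, (ii) substructure, and (iii) embedding-space properties. By benchmarking existing GSSL methods on both existing downstream datasets and MolGraphEval, we find surprising discrepancies between conclusions drawn from existing datasets alone and those drawn from more fine-grained probing, suggesting that current evaluation protocols do not provide the whole picture. Our modular, automated end-to-end GSSL pipeline code will be released upon acceptance, including standardized graph loading, experiment management, and embedding evaluation.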
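Integrating intermittent renewable energy sources into electricity grids is challenging. A well-established approach to this difficulty is to anticipate the upcoming variability in energy supply so that the grid's response can be adapted. In solar energy, short-term changes caused by occluding clouds can be predicted at different time scales from all-sky cameras (up to 30 minutes ahead) and satellite observations (up to 6 hours ahead). In this study, we integrate these two complementary views of the cloud cover in a single machine learning framework to improve intra-hour (up to 60 minutes ahead) irradiance forecasting. Both deterministic and probabilistic predictions are evaluated in different weather conditions (clear-sky, cloudy, overcast) and with different input configurations (sky images, satellite observations, and/or past irradiance values). Our results show that the hybrid model benefits predictions in clear-sky conditions and improves longer-term forecasting. This study lays the groundwork for future approaches that combine sky images and satellite observations in a single learning framework to advance solar nowcasting.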
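Translational invariance induced by pooling operations is an inherent property of convolutional neural networks, and it facilitates many computer vision tasks such as classification. However, to handle rotation-invariant tasks, convolutional architectures require specific rotation-invariant layers or extensive data augmentation to learn from different rotated versions of a given spatial configuration. Unwrapping the image into its polar coordinates provides a more explicit representation for training a convolutional architecture, because rotational invariance becomes translational; the visually distinct but otherwise equivalent rotated versions of a given scene can therefore be learned from a single image. We show on two vision-based solar irradiance forecasting challenges (using ground-taken sky images or satellite images) that this preprocessing step significantly improves prediction results by standardizing the scene representation, while reducing training time by a factor of 4 compared with augmenting the data with rotations. In addition, the transformation magnifies the area surrounding the center of rotation, leading to more accurate short-term irradiance predictions.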
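The claim that a rotation turns into a translation after polar unwrapping can be illustrated with a minimal sketch. The function names, the nearest-neighbour sampling, and the choice of the image center as the rotation center are assumptions made for illustration; the abstract does not describe the authors' implementation.

```python
import math


def unwrap_to_polar(image, n_radii, n_angles):
    """Resample a 2D image (list of equal-length rows) onto a (radius, angle) grid.

    Output row i samples a circle of radius i * max_radius / (n_radii - 1)
    around the image center; output column j is the angle 2*pi*j / n_angles.
    Nearest-neighbour sampling is used only to keep the sketch short.
    """
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    max_radius = min(cy, cx)
    polar = []
    for i in range(n_radii):
        radius = max_radius * i / (n_radii - 1) if n_radii > 1 else 0.0
        row = []
        for j in range(n_angles):
            theta = 2.0 * math.pi * j / n_angles
            y = int(round(cy + radius * math.sin(theta)))
            x = int(round(cx + radius * math.cos(theta)))
            # Clamp to guard against rounding just outside the image.
            y = min(max(y, 0), height - 1)
            x = min(max(x, 0), width - 1)
            row.append(image[y][x])
        polar.append(row)
    return polar


def rotate_quarter_turn(image):
    """Rotate a square image by 90 degrees (clockwise when row 0 is drawn on top)."""
    n = len(image)
    return [[image[n - 1 - x][y] for x in range(n)] for y in range(n)]


if __name__ == "__main__":
    image = [[r * 5 + c for c in range(5)] for r in range(5)]
    before = unwrap_to_polar(image, n_radii=3, n_angles=4)
    after = unwrap_to_polar(rotate_quarter_turn(image), n_radii=3, n_angles=4)
    for row_before, row_after in zip(before, after):
        # Each rotated row is the original row shifted by one angle step.
        print(row_before, row_after)
```

In this toy case a quarter-turn of the input shows up as a circular shift of one column along the angle axis, which is the kind of shift a convolutional network with pooling already tolerates. Rows near the top of the output cover small radii, so the region around the center is oversampled relative to the rim.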
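Artificial intelligence (AI) provides a promising alternative for streamlining COVID-19 diagnosis. However, concerns about security and trustworthiness impede the collection of large-scale representative medical data, posing a considerable challenge for training a well-generalized model for clinical practice. To address this, we launched the Unified CT-COVID AI Diagnostic Initiative (UCADI), in which the AI model can be trained in a distributed manner and executed independently at each host institution under a federated learning (FL) framework without data sharing. Here we show that our FL model outperforms all the local models by a large margin (test sensitivity/specificity in China: 0.973/0.951; in the UK: 0.730/0.942), achieving performance comparable to a panel of professional radiologists. We further evaluate the model on hold-out data (collected from another two hospitals left out of the FL) and heterogeneous data (acquired with contrast materials), provide visual explanations of the decisions made by the model, and analyze the trade-offs between model performance and communication costs in the federated training process. Our study is based on 9,573 chest computed tomography (CT) scans from 3,336 patients collected at 23 hospitals located in China and the UK. Collectively, our work advances the prospects of utilizing federated learning for privacy-preserving AI in digital health.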
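Training without data sharing can be sketched as follows: each institution trains locally and sends only model parameters, which a coordinator combines. The abstract does not state which aggregation rule UCADI uses, so the sample-size-weighted average below (the common FedAvg rule) and the name `federated_average` are assumptions for illustration only.

```python
def federated_average(client_weights, client_sizes):
    """Combine per-client parameter vectors into one global vector.

    client_weights: one flat list of floats per client (same length each).
    client_sizes:   number of local training samples at each client.
    Only parameters and counts are exchanged; raw scans never leave a client.
    """
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(n_params)
    ]


if __name__ == "__main__":
    # Two hypothetical hospitals with 1 and 3 local samples respectively.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Each communication round repeats the same pattern: broadcast the global vector, train locally, and average again. The number of rounds is one source of the performance versus communication-cost trade-off the abstract mentions.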
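Efficient integration of solar energy into the electricity mix depends on reliable anticipation of its intermittency. A promising approach to forecasting the temporal variability of solar irradiance caused by cloud cover dynamics is based on the analysis of sequences of ground-taken sky images or satellite images. Despite encouraging results, a recurrent limitation of existing deep learning approaches is their pervasive tendency to react to past observations rather than actively anticipate future events. This leads to frequent temporal lag and a limited ability to predict sudden events. To address this challenge, we introduce Eclipse, a spatio-temporal neural network architecture that models cloud motion from sky images not only to predict future irradiance levels but also to provide richer information on the local irradiance map. We show that Eclipse anticipates critical events and reduces temporal delay while generating visually realistic futures.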
As AI systems become more capable, we would like to enlist their help to supervise other AIs. We experiment with methods for training a harmless AI assistant through self-improvement, without any human labels identifying harmful outputs. The only human oversight is provided through a list of rules or principles, and so we refer to the method as 'Constitutional AI'. The process involves both a supervised learning and a reinforcement learning phase. In the supervised phase we sample from an initial model, then generate self-critiques and revisions, and then finetune the original model on revised responses. In the RL phase, we sample from the finetuned model, use a model to evaluate which of the two samples is better, and then train a preference model from this dataset of AI preferences. We then train with RL using the preference model as the reward signal, i.e. we use 'RL from AI Feedback' (RLAIF). As a result we are able to train a harmless but non-evasive AI assistant that engages with harmful queries by explaining its objections to them. Both the SL and RL methods can leverage chain-of-thought style reasoning to improve the human-judged performance and transparency of AI decision making. These methods make it possible to control AI behavior more precisely and with far fewer human labels.
Training large, deep neural networks to convergence can be prohibitively expensive. As a result, often only a small selection of popular, dense models are reused across different contexts and tasks. Increasingly, sparsely activated models, which seek to decouple model size from computation costs, are becoming an attractive alternative to dense models. Although more efficient in terms of quality and computation cost, sparse models remain data-hungry and costly to train from scratch in the large scale regime. In this work, we propose sparse upcycling -- a simple way to reuse sunk training costs by initializing a sparsely activated Mixture-of-Experts model from a dense checkpoint. We show that sparsely upcycled T5 Base, Large, and XL language models and Vision Transformer Base and Large models, respectively, significantly outperform their dense counterparts on SuperGLUE and ImageNet, using only ~50% of the initial dense pretraining sunk cost. The upcycled models also outperform sparse models trained from scratch on 100% of the initial dense pretraining computation budget.
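The initialization step can be sketched in a few lines. The details here are assumptions for illustration and are not taken from the abstract: every expert is an exact copy of the dense feed-forward block, the router is freshly initialized at random, and the top-k gate weights are renormalized to sum to one. All function and parameter names are hypothetical.

```python
import copy
import math
import random


def mlp_forward(params, x):
    """Dense feed-forward block: ReLU(x @ w_in) @ w_out, on plain lists."""
    w_in, w_out = params["w_in"], params["w_out"]
    hidden = [
        max(0.0, sum(xi * w_in[i][j] for i, xi in enumerate(x)))
        for j in range(len(w_in[0]))
    ]
    return [
        sum(hj * w_out[j][k] for j, hj in enumerate(hidden))
        for k in range(len(w_out[0]))
    ]


def upcycle(dense_params, num_experts, seed=0):
    """Build MoE parameters from a dense checkpoint (assumed recipe)."""
    rng = random.Random(seed)
    d_model = len(dense_params["w_in"])
    # Every expert starts as an independent copy of the dense block.
    experts = [copy.deepcopy(dense_params) for _ in range(num_experts)]
    # The router has no dense counterpart, so it is initialized from scratch.
    router = [[rng.gauss(0.0, 0.02) for _ in range(num_experts)]
              for _ in range(d_model)]
    return {"experts": experts, "router": router}


def moe_forward(moe, x, top_k=2):
    """Route one token to its top-k experts and mix their outputs."""
    n_experts = len(moe["experts"])
    logits = [sum(xi * moe["router"][i][e] for i, xi in enumerate(x))
              for e in range(n_experts)]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    probs = [v / sum(exps) for v in exps]
    chosen = sorted(range(n_experts), key=lambda e: probs[e], reverse=True)[:top_k]
    norm = sum(probs[e] for e in chosen)  # renormalize the selected gates
    out = [0.0] * len(moe["experts"][0]["w_out"][0])
    for e in chosen:
        gate = probs[e] / norm
        y = mlp_forward(moe["experts"][e], x)
        out = [o + gate * yk for o, yk in zip(out, y)]
    return out


if __name__ == "__main__":
    dense = {"w_in": [[0.5, -1.0, 0.25], [1.0, 0.5, -0.5]],
             "w_out": [[1.0, 0.0], [0.5, -1.0], [-0.25, 2.0]]}
    token = [1.0, 2.0]
    print(mlp_forward(dense, token), moe_forward(upcycle(dense, 4), token))
```

Under these assumptions the upcycled layer reproduces the dense layer's output at initialization, because identical experts mixed with gates that sum to one give back the same value. Training then continues from the quality already paid for, and the experts diverge as they receive different tokens.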
Synaptic plasticity allows cortical circuits to learn new tasks and to adapt to changing environments. How do cortical circuits use plasticity to acquire functions such as decision-making or working memory? Neurons are connected in complex ways, forming recurrent neural networks, and learning modifies the strength of their connections. Moreover, neurons communicate by emitting brief, discrete electrical signals. Here we describe how to train recurrent neural networks on tasks like those used to train animals in neuroscience laboratories, and how computations emerge in the trained networks. Surprisingly, artificial networks and real brains can use similar computational strategies.
Generative Adversarial Networks (GANs) were introduced by Goodfellow in 2014, and since then have become popular for constructing generative artificial intelligence models. However, such networks have numerous drawbacks, including their longer training times, their sensitivity to hyperparameter tuning, the several types of loss and optimization functions to choose from, and other difficulties such as mode collapse. Current applications of GANs include generating photo-realistic human faces, animals and objects. However, I wanted to explore the artistic ability of GANs in more detail, by using existing models and learning from them. This dissertation covers the basics of neural networks and works its way up to the particular aspects of GANs, together with experimentation on and modification of existing available models, from least complex to most. The intention is to see whether state-of-the-art GANs (specifically StyleGAN2) can generate album art covers and whether it is possible to tailor them by genre. This was attempted by first familiarizing myself with three existing GAN architectures, including the state-of-the-art StyleGAN2. The StyleGAN2 code was used to train a model on a dataset containing 80K album cover images, and the model was then used to style images by picking curated images and mixing their styles.